NSF PAR Search | NSF Public Access Repository


NysAct: A Scalable Preconditioned Gradient Descent using Nyström Approximation

https://doi.org/10.1109/BigData62323.2024.10825352

Seung, Hyunseok; Lee, Jaewoo; Ko, Hyunsuk (December 2024, IEEE)

Adaptive gradient methods are computationally efficient and converge quickly, but they often suffer from poor generalization. In contrast, second-order methods enhance convergence and generalization but typically incur high computational and memory costs. In this work, we introduce NysAct, a scalable first-order gradient preconditioning method that strikes a balance between state-of-the-art first-order and second-order optimization methods. NysAct leverages an eigenvalue-shifted Nyström method to approximate the activation covariance matrix, which is used as a preconditioning matrix, significantly reducing time and memory complexities with minimal impact on test accuracy. Our experiments show that NysAct not only achieves improved test accuracy compared to both first-order and second-order methods but also demands considerably fewer computational resources than existing second-order methods.
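As a rough illustration of the idea in this abstract, the sketch below builds a generic Nyström approximation of an activation covariance matrix from a few sampled columns. The function name `nystrom_approx`, the uniform column sampling, and the small diagonal `shift` are assumptions made for this example; they are not taken from the paper and may differ from its eigenvalue-shifted formulation.

```python
import numpy as np

def nystrom_approx(A, idx, shift=1e-6):
    """Generic Nystrom approximation of a symmetric PSD matrix A.

    idx   : indices of the sampled (landmark) columns
    shift : small diagonal shift for numerical stability (an assumption
            of this sketch, not the paper's exact shifting scheme)
    """
    C = A[:, idx]                       # n x m block of sampled columns
    W = A[np.ix_(idx, idx)]             # m x m intersection block
    W_shifted = W + shift * np.eye(len(idx))
    # A is approximated by C (W + shift*I)^{-1} C^T
    return C @ np.linalg.solve(W_shifted, C.T)

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 32))      # hypothetical activations: batch x features
A = X.T @ X / X.shape[0]                # 32 x 32 activation covariance

idx = rng.choice(32, size=8, replace=False)
A_hat = nystrom_approx(A, idx)          # built from 8 of the 32 columns

# Sanity check: sampling every column recovers A up to the shift.
A_full = nystrom_approx(A, np.arange(32))
```

Only the `n x m` and `m x m` blocks enter the solve, which is where the time and memory savings of this kind of approximation come from when `m` is much smaller than `n`.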
Full Text Available
Enhancing Program Analysis with Deterministic Distinguishable Calling Context

https://doi.org/10.1145/3708493.3712679

Kim, Sungkeun; Nguyen, Khanh; Tsai, Chia-Che; Lee, Jaewoo; Muzahid, Abdullah; Kim, Eun Jung (February 2025, ACM)

Calling context is crucial for improving the precision of program analyses in various use cases (clients), such as profiling, debugging, optimization, and security checking. Often the calling context is encoded using a numerical value. We have observed that many clients benefit from a value that is not only deterministic but also globally distinguishable across runs, which simplifies bookkeeping and guarantees complete uniqueness. However, existing work only guarantees determinism, not global distinguishability. Clients need to develop auxiliary helpers to distinguish encoded values among all calling contexts, which incurs considerable overhead. In this paper, we propose Deterministic Distinguishable Calling Context Encoding, which provides both properties of calling context encoding natively. Its key idea is to leverage the static call graph and encode each calling context as the running call path count. Thereby, a mapping is established statically and can be readily used by the clients. Our experiments with two client tools show that the proposed encoding has overhead comparable to two state-of-the-art encoding schemes, PCCE and PCC, and further avoids the expensive overheads of collision detection, up to 2.1× and 50%, for Splash-3 and SPEC CPU 2017, respectively.
Free, publicly-accessible full text available February 25, 2026
Characterizing losses in InAs two-dimensional electron gas-based gatemon qubits

https://doi.org/10.1103/PhysRevResearch.6.023094

Strickland, William M; Baker, Lukas J; Lee, Jaewoo; Dindial, Krishna; Elfeky, Bassel Heiba; Strohbeen, Patrick J; Hatefipour, Mehdi; Yu, Peng; Levy, Ido; Issokson, Jacob; et al (April 2024, Physical Review Research)

Full Text Available
MBAG: A Scalable Mini-Block Adaptive Gradient Method for Deep Neural Networks

https://doi.org/10.1109/BigData55660.2022.10020262

Lee, Jaewoo (December 2022, IEEE)

Full Text Available
Quasiparticle Dynamics in Epitaxial Al-InAs Planar Josephson Junctions

https://doi.org/10.1103/prxquantum.4.030339

Elfeky, Bassel Heiba; Strickland, William M; Lee, Jaewoo; Farmer, James T; Shanto, Sadman; Zarassi, Azarin; Langone, Dylan; Vavilov, Maxim G; Levenson-Falk, Eli M; Shabani, Javad (September 2023, PRX Quantum)

Full Text Available
Differentially Private Normalizing Flows for Synthetic Tabular Data Generation

https://doi.org/10.1609/aaai.v36i7.20697

Lee, Jaewoo; Kim, Minjung; Jeong, Yonghyun; Ro, Youngmin (June 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

Normalizing flows have been shown to be a promising approach to deep generative modeling due to their ability to exactly evaluate density; other alternatives either implicitly model the density or use an approximate surrogate density. In this work, we present a differentially private normalizing flow model for heterogeneous tabular data. Normalizing flows are in general not amenable to differentially private training because they require complex neural networks with larger depth (compared to other generative models) and use specialized architectures for which per-example gradient computation is difficult (or unknown). To reduce the parameter complexity, the proposed model introduces a conditional spline flow which simulates transformations at different stages depending on additional input and is shared among sub-flows. For privacy, we introduce two fine-grained gradient clipping strategies that provide a better signal-to-noise ratio and derive fast gradient clipping methods for layers with custom parameterization. Our empirical evaluations show that the proposed model preserves statistical properties of the original dataset better than other baselines.
Full Text Available
Wasserstein Adversarial Transformer for Cloud Workload Prediction

https://doi.org/10.1609/aaai.v36i11.21509

Arbat, Shivani; Jayakumar, Vinodh Kumaran; Lee, Jaewoo; Wang, Wei; Kim, In Kee (June 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

Predictive VM (Virtual Machine) auto-scaling is a promising technique to optimize cloud applications’ operating costs and performance. Understanding the job arrival rate is crucial for accurately predicting future changes in cloud workloads and proactively provisioning and de-provisioning VMs for hosting the applications. However, developing a model that accurately predicts cloud workload changes is extremely challenging due to the dynamic nature of cloud workloads. Long Short-Term Memory (LSTM) models have been developed for cloud workload prediction. Unfortunately, the state-of-the-art LSTM model leverages recurrences to predict, which naturally adds complexity and increases the inference overhead as input sequences grow longer. To develop a cloud workload prediction model with high accuracy and low inference overhead, this work presents a novel time-series forecasting model called WGAN-gp Transformer, inspired by the Transformer network and improved Wasserstein-GANs. The proposed method adopts a Transformer network as a generator and a multi-layer perceptron as a critic. Extensive evaluations with real-world workload traces show that WGAN-gp Transformer achieves 5× faster inference time with up to 5.1% higher prediction accuracy than the state-of-the-art. We also apply WGAN-gp Transformer to auto-scaling mechanisms on Google cloud platforms, and the WGAN-gp Transformer-based auto-scaling mechanism outperforms the LSTM-based mechanism by significantly reducing VM over-provisioning and under-provisioning rates.
Full Text Available
Differentially Private Goodness-of-Fit Tests for Continuous Variables

https://doi.org/10.1016/j.ecosta.2021.09.007

Kwak, Seung Woo; Ahn, Jeongyoun; Lee, Jaewoo; Park, Cheolwoo (October 2021, Econometrics and Statistics)

Full Text Available
Performance Testing for Cloud Computing with Dependent Data Bootstrapping

https://doi.org/10.1109/ASE51524.2021.9678687

He, Sen; Liu, Tianyi; Lama, Palden; Lee, Jaewoo; Kim, In Kee; Wang, Wei (November 2021, IEEE/ACM International Conference on Automated Software Engineering, 2021)

Full Text Available
Scaling up Differentially Private Deep Learning with Fast Per-Example Gradient Clipping

https://doi.org/10.2478/popets-2021-0008

Lee, Jaewoo; Kifer, Daniel (January 2021, Proceedings on Privacy Enhancing Technologies)
Recent work on Rényi Differential Privacy has shown the feasibility of applying differential privacy to deep learning tasks. Despite their promise, however, differentially private deep networks often lag far behind their non-private counterparts in accuracy, showing the need for more research in model architectures, optimizers, etc. One of the barriers to this expanded research is the training time, which is often orders of magnitude larger than for training non-private networks. The reason for this slowdown is a crucial privacy-related step called “per-example gradient clipping”, whose naive implementation undoes the benefits of batch training with GPUs. By analyzing the back-propagation equations, we derive new methods for per-example gradient clipping that are compatible with auto-differentiation (e.g., in PyTorch and TensorFlow) and provide better GPU utilization. Our implementation in PyTorch showed significant training speed-ups (by factors of 54x to 94x for training various models with batch sizes of 128). These techniques work for a variety of architectural choices including convolutional layers, recurrent networks, attention, residual blocks, etc.
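To make the cost of per-example gradient clipping concrete, here is a minimal NumPy sketch for a single bias-free fully connected layer. It relies on the standard identity that the per-example weight gradient is an outer product, so its norm factors into two vector norms. The variable names, shapes, and clipping threshold are assumptions for illustration; this is not the paper's implementation, which covers many more layer types.

```python
import numpy as np

rng = np.random.default_rng(0)
B, d_in, d_out = 4, 5, 3
X = rng.standard_normal((B, d_in))    # layer inputs, one row per example
D = rng.standard_normal((B, d_out))   # backpropagated d(loss_i)/d(output_i)

# Naive: materialize every per-example gradient outer(D[i], X[i]).
naive_norms = np.array(
    [np.linalg.norm(np.outer(D[i], X[i])) for i in range(B)]
)

# Fast: ||outer(d, x)||_F = ||d|| * ||x||, so no per-example matrices are needed.
fast_norms = np.linalg.norm(D, axis=1) * np.linalg.norm(X, axis=1)

clip = 1.0                            # hypothetical clipping threshold
scale = np.minimum(1.0, clip / (fast_norms + 1e-12))

# Sum of clipped per-example gradients as one batched matrix product:
# sum_i scale_i * outer(D[i], X[i]) == (scale * D)^T @ X
clipped_sum = (D * scale[:, None]).T @ X
```

The fast path keeps the whole computation in batched matrix operations, which is the property that lets clipping coexist with GPU-friendly batch training.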
Full Text Available
